FIPIP: A novel fine-grained parallel partition based intra-frame prediction on heterogeneous many-core systems
Authors
Abstract
Intra-frame prediction is an important, time-consuming component of the widely used H.264/AVC encoder. One promising way to speed up prediction is to introduce parallelism, and many approaches based on heterogeneous many-core systems have been proposed. Most of these approaches, however, are limited by their use of highly irregular prediction formulas, which require a significant number of branch instructions. They also use only coarse-grained parallel partition, which treats blocks or sub-regions of images as the parallel processing units. In this paper, by contrast, we propose a fine-grained intra-frame prediction approach based on parallel partition (FIPIP) and implement it on Graphics Processing Unit (GPU) based heterogeneous many-core systems. The approach is characterized by the following aspects. First, it takes individual pixels, instead of blocks, as the parallel processing units. Pixel-level parallelism can fully exploit the computational power of heterogeneous GPU-based systems and hence greatly reduces the encoding time. Second, we unify the irregular prediction formulas of intra-frame prediction into a single well-designed uniform formula, and propose a table-lookup method to perform intra-frame prediction efficiently. Our formula eliminates unnecessary branch instructions by using a unified predictor array, which significantly improves the efficiency of the fine-grained parallel partition. Third, two optimized encoding orders, assisted by an improved combined frame strategy, are adopted to implement multi-level parallelism. Finally, an efficient self-synchronizing method is realized for fine-grained task scheduling on the heterogeneous CPU–GPU architecture. We apply FIPIP to encode a set of benchmark videos under varying conditions and compare it with other popular intra-frame prediction methods. Results show that FIPIP outperforms existing state-of-the-art work, with speedup factors of 2–6.
Corresponding author: W. Jiang. E-mail address: [email protected]
http://dx.doi.org/10.1016/j.future.2016.05.009
0167-739X/© 2016 Elsevier B.V. All rights reserved.
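The abstract's second point, replacing mode-specific formulas with one table-driven formula over a unified predictor array, can be illustrated with a minimal sketch. The abstract does not give the actual FIPIP formula, so everything below is an assumption made for illustration: the layout of the predictor array `P`, the table format, and the names `build_table`, `predict_pixel` and `predict_block` are all hypothetical. The sketch covers only four of the standard H.264 4x4 luma modes (vertical, horizontal, DC with all neighbours available, diagonal down-left) and writes each one as a rounded, shifted weighted sum of neighbouring pixels.

```python
# Illustrative sketch only: the table layout and names are assumptions,
# not the formula or data structures used in the FIPIP paper.

# Assumed unified predictor array P for one 4x4 block (13 neighbouring pixels):
#   P[0..3]  = left column, top to bottom
#   P[4]     = top-left corner (unused by the four modes below)
#   P[5..12] = top row followed by the top-right row
LEFT, TOP = 0, 5


def build_table():
    """Precompute, per (mode, pixel), the taps (index, weight), rounding and shift."""
    table = {}
    for y in range(4):
        for x in range(4):
            pix = 4 * y + x
            # Mode 0, vertical: copy the pixel directly above.
            table[(0, pix)] = ([(TOP + x, 1)], 0, 0)
            # Mode 1, horizontal: copy the pixel directly to the left.
            table[(1, pix)] = ([(LEFT + y, 1)], 0, 0)
            # Mode 2, DC: rounded mean of the 4 top and 4 left neighbours.
            dc_taps = [(TOP + i, 1) for i in range(4)]
            dc_taps += [(LEFT + i, 1) for i in range(4)]
            table[(2, pix)] = (dc_taps, 4, 3)
            # Mode 3, diagonal down-left: 3-tap filter along the top row,
            # with the usual special case for the bottom-right pixel.
            if x == 3 and y == 3:
                taps = [(TOP + 6, 1), (TOP + 7, 3)]
            else:
                s = TOP + x + y
                taps = [(s, 1), (s + 1, 2), (s + 2, 1)]
            table[(3, pix)] = (taps, 2, 2)
    return table


TABLE = build_table()


def predict_pixel(mode, pix, P):
    """Work of one 'thread': one table lookup, then the same formula for every mode."""
    taps, rnd, shift = TABLE[(mode, pix)]
    return (sum(w * P[i] for i, w in taps) + rnd) >> shift


def predict_block(mode, P):
    """The 16 pixels are independent, so each could be mapped to its own GPU thread."""
    return [predict_pixel(mode, pix, P) for pix in range(16)]
```

Calling `predict_block(mode, P)` with a 13-element list `P` returns the 16 predicted pixels in raster order. In this sketch the mode only selects a table row, so no per-mode branch is needed inside `predict_pixel`; a GPU version would likely pad the tap lists to a fixed length so that all threads execute identical instructions.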
Related articles
An Energy-efficient Parallel H.264/AVC Baseline Encoder on a Fine-grained Many-core System
The emerging many-core architecture provides a flexible solution for rapidly evolving multimedia applications that demand both high performance and high energy efficiency. However, developing parallel multimedia applications that can efficiently harness many-core architectures is the key challenge for scalable computing. We contribute to this challenge by presenting a fully-parallel ...
Energy-efficient Fine-grained Many-core Architecture for Video and DSP Applications
Many-core processor architecture has become the most promising computer architecture. However, how to utilize the extra system performance for real applications such as video encoding is still challenging. This dissertation investigates architecture design, physical implementation and performance evaluation of a fine-grained many-core processor for advanced video coding with a focus on intercon...
Efficient parallelization of the genetic algorithm solution of traveling salesman problem on multi-core and many-core systems
Efficient parallelization of genetic algorithms (GAs) on state-of-the-art multi-threading or many-threading platforms is a challenge because hardware resources are difficult to schedule under thread concurrency. In this paper, for resolving the problem, a novel method is proposed, which parallelizes the GA by designing three concurrent kernels, each of which running some depe...
A Modular and Parameterisable Classification of Algorithms
Multi-core and many-core have been major trends for the past six years, and are expected to continue for the next decades. With this trend of parallel computing, it becomes increasingly difficult to decide on which architecture to run a certain application or algorithm. Additionally, it brings forth the problem of parallel programming, leading to the so-called software engineering crisis. In...
Towards Fine-grained Spatial Partition for Wildfire Simulation
Task partitioning is one of the most challenging issues in large-scale parallel simulations. The simulation performance may vary dramatically if different partition schemes are used. This paper presents a spatial partition method named fine-grained spatial partition for parallel simulation of large-scale wildfires. The fine-grained spatial partition is inspired by the four color theorem. It div...
Journal title:
- Future Generation Comp. Syst.
Volume 78
Publication date: 2018